Robust Automatic Video-Conferencing with Multiple Cameras and Microphones

Authors

  • Ce Wang
  • Scott M. Griebel
  • Michael S. Brandstein
Abstract

An automatic video-conferencing system is proposed which employs acoustic source localization, video face tracking and pose estimation, and multi-channel speech enhancement. The video portion of the system tracks talkers by utilizing source motion, contour geometry, color data, and simple facial features. Decisions involving which camera to use are based on an estimate of the head’s gazing angle. This head pose estimation is achieved using a very general head model which employs hairline features and a learned network classification procedure. Finally, a wavelet microphone array technique is used to create an enhanced speech waveform to accompany the recorded video signal. The system presented in this paper is robust to both visual clutter (e.g. ovals in the scene of interest which are not faces) and audible noise (e.g. reverberations and background noise).
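The camera-selection step described in the abstract can be illustrated with a minimal sketch. The abstract states only that the choice of camera is based on an estimate of the head's gazing angle; the rule used here (pick the camera whose bearing, as seen from the talker, is closest to the estimated gaze angle) and the names `angular_difference`, `select_camera`, `gaze_deg`, and `camera_bearings_deg` are assumptions made for illustration, not the authors' published method.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def select_camera(gaze_deg, camera_bearings_deg):
    """Return the index of the camera the talker most nearly faces.

    gaze_deg: estimated head gaze angle in a shared room coordinate frame
        (hypothetical convention, not taken from the paper).
    camera_bearings_deg: bearing of each camera as seen from the talker,
        in the same frame (also an assumed representation).
    """
    if not camera_bearings_deg:
        raise ValueError("at least one camera is required")
    # Assumed rule: the best view is the camera with the smallest
    # angular offset from the direction the head is facing.
    return min(
        range(len(camera_bearings_deg)),
        key=lambda i: angular_difference(gaze_deg, camera_bearings_deg[i]),
    )
```

For example, with four cameras at bearings 0, 90, 180, and 270 degrees, a gaze estimate of 350 degrees would select the camera at 0 degrees, because the comparison wraps around the circle. A practical system would likely also smooth the decision over time to avoid rapid switching between cameras.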

Related resources

Audiovisual Head Orientation Estimation with Particle Filtering in Multisensor Scenarios

This article presents a multimodal approach to head pose estimation of individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing. Determining the individual's head orientation is the basis for many forms of more sophisticated interactions between humans and technical devices and can also be used for automatic sensor selection (...

Automatic camera control using unobtrusive vision and audio tracking

While video can be useful for remotely attending and archiving meetings, the video itself is often dull and difficult to watch. One key reason for this is that, except in very high-end systems, little attention has been paid to the production quality of the video being captured. The video stream from a meeting often lacks detail and camera shots rarely change unless a person is tasked with oper...

Collaboration Support Using Environment Images and Videos

This paper summarizes our environment-image/video-supported collaboration technologies developed in the past several years. These technologies use environment images and videos as active interfaces and use visual cues in these images and videos to orient device controls, annotations and other information access. By using visual cues in various interfaces, we expect to make the control interface ...

Developmentally Appropriate Technology in Early Childhood: ‘Video Conferencing’ – a limit case?

This paper originates from our desire to identify a limit case of appropriate educational applications of information and communications technology (ICT). Numerous claims have been made about the potential of technology to change the traditionally accepted developmental limits on children’s learning. Claims had also been made in the UK regarding the successful application of video conferencing ...

Multimodal 3-D Tracking and Event Detection via the Particle Filter

Determining the occurrence of an event is fundamental to developing systems that can observe and react to them. Often, this determination is based on collecting video and/or audio data and determining the state or location of a tracked object. We use Bayesian inference and the particle filter for tracking moving objects, using both video data obtained from multiple cameras and audio data obtain...

Publication date: 2000